1. INTRODUCTION

Entrained behavior coordinates, predicts, 
and modulates multi-scale rhythmic ges- 
tures with high spatio-temporal precision 
even as it shows flexible adaptation in 
response to perturbation (Clayton et al.,
2005; Altenmüller et al., 2006;
Phillips-Silver et al., 2010). The capacity for this
split-second, multi-scale timing is often 
viewed as a highly-complex, specialized 
virtuosity that emerged in the forges of 
natural selection for evolutionary advan- 
tage (Mithen, 2005; Knoblich and Sebanz, 
2008; Merker et al., 2009). Entrainment
has found compelling mathematical mod- 
els in the interaction of multiple dynamic 
oscillators (Large, 2010) and convincing 
neurological substrata in the electrophys- 
iological resonance patterns that support 
cognition (Nozaradan et al., 2011; Schaefer
et al., 2011). Further, entrainment-based 
therapeutic interventions have been
validated in both quantitative (Thaut and
Abiru, 2010) and qualitative (Aigen, 2008) 
studies. 

This paper aims to bolster the theoreti- 
cal case for the transformational potential 
of entrainment therapy by casting it in 
the framework of contemporary engineer- 
ing mathematics, in particular applying 
the concepts of change of basis, Fourier 
transform, and most importantly, the 
growing body of work on Joint Sparse 
Representation (JSR) (Bruckstein et al., 
2009). The paper aims to be a conceptual 
introduction in the hopes of reaching a 
wider audience that may want to make use 
of the relationship between entrainment 
and sparsity, and apply more engineering 
mathematics to their analyses of entrain- 
ment in therapy and performance. 



2. THREE KEY CONCEPTS 

2.1. CHANGE OF BASIS 

Many of the engineering marvels around 
us have, as a keystone of their math- 
ematical foundations, a change of basis 
(Kreyszig, 2007). A technical definition of 
a mathematical basis is a set of linearly 
independent vectors within a space that, in 
combination, can span the entirety of that 
space. For example, the Cartesian basis 
for three-dimensional real space (R^3)
is a set of three orthogonal (perpendicular)
unit (length of one) vectors, pointing
along the x, y, and z axes respectively. In
vector notation the orthonormal Cartesian
R^3 vectors are [1 0 0], [0 1 0] and
[0 0 1]. We say these vectors span R^3, as any
point in R^3, for which we have the
coordinates [x y z], can be reached from the
origin using the vectors [x 0 0] + [0 y 0]
+ [0 0 z]. The Cartesian R^3 basis is, in
other words, the way we might account
for spatial activity using rulers or graph
paper. Add a fourth dimension for time,
to span spatio-temporal activity, and the
same rules apply for any vector [x y z t].
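
The idea of re-expressing the same point in a different basis can be sketched numerically. The following is a minimal illustration, assuming NumPy is available; the alternative basis `new_basis` and the sample point are hypothetical values chosen only for the example.

```python
import numpy as np

# Standard (Cartesian) basis of R^3: the columns of the identity matrix.
cartesian = np.eye(3)

# A hypothetical alternative basis, chosen only for illustration.
# Its columns are linearly independent, so they also span R^3.
new_basis = np.array([[1.0, 1.0, 0.0],
                      [0.0, 1.0, 1.0],
                      [0.0, 0.0, 1.0]])

# Coordinates [x y z] of a point in the Cartesian basis.
point = np.array([2.0, 3.0, 5.0])

# Change of basis: solve new_basis @ coords = point for the new coordinates.
coords = np.linalg.solve(new_basis, point)

# Recombining the new basis vectors with those coordinates returns the point.
reconstructed = new_basis @ coords

print("coordinates in the new basis:", coords)
print("reconstructed point:", reconstructed)
```

The point itself does not move; only the numbers used to describe it change with the choice of basis.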

2.2. THE FOURIER BASIS AND THE 
FREQUENCY DOMAIN 

This spatio-temporal Cartesian basis is our 
most intuitive approach to representing 
the world around us, but also a very poor 
representation for solving many engineer- 
ing problems. One of the most commonly 
used changes of basis is the family of 
Fourier or frequency-domain transforms, 
in which a function is represented on a 
basis of sinusoidal periodic functions rather 
than units of Cartesian distance. In its
discrete form, a signal is transformed from a
series of consecutive sampled values into
a combination of sinusoids of different
amplitudes and frequencies.

While conceptually cumbersome at 
first, Fourier transformation has many 
advantages for not only the analysis, but
also the storage and compression, of many kinds
of data. Take, for example, a sample of a 
single musical note, vibrating at a partic- 
ular frequency, that would appear on an 
oscilloscope as a complex periodic wave- 
form. In the time domain, this signal will 
be dense, that is, it will contain few if 
any zeros and most of the signal will be 
required for its reconstruction as
specified by the Nyquist-Shannon sampling
theorem (Shannon, 1949). If, however, the 
signal is like most signals coming from a 
musical instrument — a combination of a 
fundamental frequency and a small num- 
ber of overtone frequencies — then it can 
be represented in the Fourier domain with 
a small number of values, one for each 
component frequency, leaving the rest of 
the basis vectors at zero magnitude. The 
signal vector thus meets the mathemati- 
cal definition of sparse — most of its coef- 
ficients are zero — and its representation 
can be efficiently compressed, requiring far 
less data for its representation than the 
Nyquist theorem specifies. Figure 1A
illustrates the relationship between a complex
periodic waveform and its sparse Fourier 
transformation. 
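
A small numerical sketch may make the dense-versus-sparse contrast concrete. It assumes NumPy and uses the discrete Fourier transform (`np.fft.rfft`) rather than the DCT-II of the figure; the sample count, frequencies, and amplitudes are illustrative assumptions, not values from the text.

```python
import numpy as np

n = 256                    # number of samples (assumed)
t = np.arange(n) / n       # one analysis window, normalised to unit length

# A "note": a fundamental at 8 cycles per window plus two overtones.
# Frequencies and amplitudes are assumptions made for this example.
signal = (1.00 * np.sin(2 * np.pi * 8 * t)
          + 0.50 * np.sin(2 * np.pi * 16 * t)
          + 0.25 * np.sin(2 * np.pi * 24 * t))

# Discrete Fourier transform, scaled so magnitudes equal the sine amplitudes.
spectrum = np.fft.rfft(signal) / (n / 2)
magnitudes = np.abs(spectrum)

tol = 1e-9
dense_count = int(np.sum(np.abs(signal) > tol))   # non-zero samples in time
active_bins = np.flatnonzero(magnitudes > tol)    # non-zero frequency bins
sparse_count = int(active_bins.size)

print("non-zero time samples:", dense_count, "of", n)
print("non-zero frequency coefficients:", sparse_count, "at bins", active_bins)
```

In this idealised case each component completes a whole number of cycles in the window, so exactly one coefficient per component is non-zero; recorded notes would show some leakage into neighbouring bins.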

Mathematically, a signal and its Fourier
transform are related by a one-to-one
(invertible) mapping. The
frequency-domain representation of the 
signal is often much more efficient, how- 
ever, in the sense that far more of the signal 
information is packed into a small subset 
of the basis vectors. JPEG
(Skodras et al., 2001) and MPEG (Le Gall,
1991) compression schemes, for example,
discard well over 90% of the data
within a signal in part by transforming that
signal into the frequency domain (DCT in
the case of MPEG and Daubechies wavelet
in the case of JPEG 2000) and eliminating
the many frequency bands of near-zero
magnitude. The resulting compressed
data formats still retain enough of the
significant information to have become
the lingua franca of images and music,
respectively. An example of data
compression in the spatial frequency domain is
seen in Figure 1B.

FIGURE 1 | (A) A dense periodic discrete signal may have a sparse
representation when transformed into the Fourier domain (DCT-II
transform). (B) Image of Edinburgh Castle, with Spatial (Cartesian
basis), Discrete Cosine Transform (frequency domain basis) and
Singular Value Decomposition (least-squares optimal basis)
compression applied at decreasing compression rates. Source: Stuart
Caie, CC BY 2.0 license. Reproduced grayscale with described
modifications.
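
The principle of discarding near-zero frequency coefficients can be sketched in a few lines. This is not the JPEG or MPEG pipeline, only a toy illustration assuming NumPy; the test signal, noise level, and the 5% retention rate are hypothetical choices.

```python
import numpy as np

rng = np.random.default_rng(0)     # fixed seed, for repeatability only
n = 512
t = np.arange(n) / n

# Hypothetical test signal: three sinusoids plus weak noise.
clean = (np.sin(2 * np.pi * 5 * t)
         + 0.6 * np.sin(2 * np.pi * 12 * t)
         + 0.3 * np.sin(2 * np.pi * 40 * t))
signal = clean + 0.01 * rng.standard_normal(n)

# Move to the frequency domain and keep only the largest 5% of coefficients.
coeffs = np.fft.rfft(signal)
keep = max(1, int(0.05 * coeffs.size))
threshold = np.sort(np.abs(coeffs))[-keep]
compressed = np.where(np.abs(coeffs) >= threshold, coeffs, 0)

# Reconstruct from the retained coefficients and measure what was lost.
restored = np.fft.irfft(compressed, n=n)
relative_error = np.linalg.norm(signal - restored) / np.linalg.norm(signal)
kept_fraction = np.count_nonzero(compressed) / coeffs.size

print(f"kept {kept_fraction:.1%} of coefficients")
print(f"relative reconstruction error: {relative_error:.4f}")
```

Because the signal's energy sits in a handful of frequency bins, most coefficients can be dropped with little effect on the reconstruction.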

Frequency-domain transform is hard- 
wired into the anatomy of the cochlea, 
whose hair cells of varying stiffness res- 
onate with stimuli of specific frequencies, 
triggering action potentials via auditory 
transduction. The inner ear thus performs
a frequency transform of incoming
auditory information across a small
temporal window, known in its simplest
form as a short-time Fourier transform
(STFT), though actual observed performance
resembles a somewhat more complex
transform known as time-frequency
reassignment (Auger et al., 2013).
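
For readers unfamiliar with the STFT, the sketch below shows its basic mechanics: a short window slides along the signal and a Fourier transform is taken of each windowed frame. It assumes NumPy; the function name `stft_magnitude`, the frame length, the hop size, and the test tone are illustrative assumptions. It is not a model of the cochlea and does not implement time-frequency reassignment.

```python
import numpy as np

def stft_magnitude(x, frame_len=128, hop=64):
    """Magnitude of a basic short-time Fourier transform (Hann window)."""
    window = np.hanning(frame_len)
    starts = range(0, len(x) - frame_len + 1, hop)
    frames = np.array([x[s:s + frame_len] * window for s in starts])
    # One spectrum per frame: shape is (number of frames, frequency bins).
    return np.abs(np.fft.rfft(frames, axis=1))

# Hypothetical input: a tone that jumps from frequency bin 8 to bin 24
# halfway through the recording.
n = 1024
t = np.arange(n)
x = np.where(t < n // 2,
             np.sin(2 * np.pi * 8 * t / 128),
             np.sin(2 * np.pi * 24 * t / 128))

spec = stft_magnitude(x)
peak_bins = spec.argmax(axis=1)    # dominant frequency bin in each frame

print("frames x bins:", spec.shape)
print("dominant bin per frame:", peak_bins)
```

Unlike a single Fourier transform of the whole signal, the frame-by-frame view shows when the frequency change occurs as well as which frequencies are present.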

2.3. SPARSE OVERCOMPLETE CODING 

If sufficient information about the signal
can be deduced from a small portion
of a signal via a mathematical transform,
then the benefits to the actor are obvious.
Both computationally and metabolically, 
the organism that can reduce processing 
demands by such a large amount can 
expect to reap benefits. If frequency- 
domain and similar bases yield such 
improvements in information coding effi- 
ciency, the key question for modeling neu- 
ral coding is to ask how that information 
might be coded to its optimum. 

The optimal basis for a signal in
a least-squares sense is given by its Singular
Value Decomposition (SVD) (Strang,
2007). A comparison of spatio-temporal,
frequency-domain and SVD data
compression is shown in Figure 1B. The
frequency-domain images show many 
more features at each level of compres- 
sion than the Cartesian (nearest-neighbor) 
compression, while the SVD images show 
substantially more than either. 
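
The least-squares optimality of the SVD can be checked numerically: by the Eckart-Young theorem, truncating to the k largest singular values gives the best rank-k approximation, and the Frobenius-norm error equals the norm of the discarded singular values. The sketch below assumes NumPy; the data matrix is synthetic and hypothetical, standing in for an image or a set of recorded signals.

```python
import numpy as np

rng = np.random.default_rng(1)     # fixed seed, for repeatability only

# Hypothetical data: a 60 x 40 matrix that is rank 3 plus weak noise.
A = rng.standard_normal((60, 3)) @ rng.standard_normal((3, 40))
A += 0.01 * rng.standard_normal(A.shape)

U, s, Vt = np.linalg.svd(A, full_matrices=False)

def truncate(k):
    """Rank-k approximation built from the k largest singular values."""
    return (U[:, :k] * s[:k]) @ Vt[:k, :]

k = 3
A_k = truncate(k)

# Frobenius-norm error of the approximation ...
error = np.linalg.norm(A - A_k)
# ... equals the norm of the singular values that were discarded.
tail = np.sqrt(np.sum(s[k:] ** 2))
relative_error = error / np.linalg.norm(A)

print(f"error {error:.4f}, discarded-singular-value norm {tail:.4f}")
print(f"relative error with {k} of {s.size} components: {relative_error:.4f}")
```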

However, the SVD of a single signal 
is not necessarily the sparsest representa- 
tion of that signal in the context of a set 
of signals such as that encoded in neural 
memory. Much greater compression can 
be attained through the re-use of com- 
mon basis vectors to transform many sig- 
nals. In this approach the process of neural 
memory is modeled as manipulation of 
a set of learned basis vectors known as 
a "dictionary," in which incoming signals 
are decomposed in the sparsest possible 
way using the atomic vectors or "atoms" 
that make up the dictionary (Rubinstein
et al., 2010). This operation is non-linear,
but many efficient algorithms have been
developed for sparse dictionary coding,
primarily through L1-norm minimization
(Donoho and Elad, 2003). The most
efficient dictionary systems are found to be
"sparse overcomplete," that is, they con- 
sist of many more basis vectors than nec- 
essary for the set of signals, but have 
great flexibility to maximize the sparsity 
with which an incoming signal is encoded 
(Bruckstein et al., 2009; Rubinstein et al.,
2010). 
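
A toy example of sparse coding over an overcomplete dictionary is sketched below. Note the substitution: instead of the L1-norm minimization cited above, it uses orthogonal matching pursuit (OMP), a greedy alternative that fits in a few lines. It assumes NumPy; the dictionary (unit spikes plus orthonormal DCT atoms), the helper names `dct_basis` and `omp`, and the test signal are all illustrative assumptions.

```python
import numpy as np

def dct_basis(n):
    """Matrix whose columns are orthonormal DCT-II atoms of length n."""
    k = np.arange(n)
    basis = np.sqrt(2.0 / n) * np.cos(np.pi * (k[:, None] + 0.5) * k[None, :] / n)
    basis[:, 0] = np.sqrt(1.0 / n)
    return basis

def omp(D, y, n_nonzero):
    """Orthogonal matching pursuit: greedily pick atoms, refit by least squares."""
    residual = y.copy()
    support = []
    coeffs = np.zeros(0)
    for _ in range(n_nonzero):
        # Pick the atom most correlated with what is still unexplained.
        support.append(int(np.argmax(np.abs(D.T @ residual))))
        coeffs, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        residual = y - D[:, support] @ coeffs
    code = np.zeros(D.shape[1])
    code[support] = coeffs
    return code

n = 64
# Overcomplete dictionary: 64 spike atoms and 64 cosine atoms (128 in total).
D = np.hstack([np.eye(n), dct_basis(n)])

# Hypothetical signal: one click (spike atom 20) plus one oscillation
# (cosine atom 5, stored at dictionary column 64 + 5).
y = 1.5 * D[:, 20] + 2.0 * D[:, n + 5]

code = omp(D, y, n_nonzero=2)
used_atoms = np.flatnonzero(np.abs(code) > 1e-8)

print("atoms used:", used_atoms, "of", D.shape[1])
print("coefficients:", code[used_atoms])
```

Neither the spike basis nor the cosine basis alone represents this signal sparsely, while the combined dictionary needs only two of its 128 atoms, which is the flexibility the overcomplete approach is meant to provide.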

Finally, the atoms of the dictionary
must adapt to new signals in accordance
with the principles of Hebbian
and Bayesian learning. Efficient algorithms
have been developed for this process as
well, whether the classic K-SVD (Aharon
et al., 2006) or more recent parametric or
multiscale dictionary updating algorithms
(Rubinstein et al., 2010).



Perhaps unsurprisingly, there is abun- 
dant experimental evidence for such sparse 
coding in human and animal brains 
(Olshausen and Field, 2004). Evidence 
supporting a sparse coding model has 
been found in studies of visual (Olshausen 
and Field, 1997; Vinje and Gallant, 2000), 
auditory (Hromádka et al., 2008),
olfactory (Ito et al., 2008; Poo and Isaacson,
2009), haptic (Jadhav et al., 2009; Crochet
et al., 2011) and motor (Hahnloser et al.,
2002) processing. Sparse coding models
relate to the neuroanatomical observation 
that progressive stages in signal processing
have increasingly redundant numbers
of neurons that each fire increasingly
rarely (Olshausen and Field, 2004). This 
is no longer projected to lead to signal- 
specific "grandmother cells" but rather to 
a maximally sparse and overcomplete rep- 
resentation of the world given metabolic 
constraints. 

3. PUTTING IT ALL TOGETHER: THE
SPARSITY OF ENTRAINMENT 

The sparsity argument for entrainment is 
then as follows: Phenomena that contain 
regularities are more efficiently encoded in 
the frequency domain. We can therefore 
expect that the optimal basis, such as that 
obtained through SVD, would be much 
more similar to the frequency-domain 
mapping of the signal, by a common
similarity measure such as tangent distance
(Simard et al., 1998), than the spatiotem- 
poral mapping of the signal. Finally, over 
time we can expect the atoms in the brain's 
sparse overcomplete dictionary to mini- 
mize metabolic and computational costs 
by reconstructing signals along bases that 
are closer in tangent distance to the fre- 
quency domain than the spatiotemporal. 

Returning to the descriptions of 
entrainment in the literature, many of 
the characteristic behaviors found in 
entrainment can be accounted for with 
greater conceptual economy by apply- 
ing sparsity-related concepts. Entrained 
movement is not necessarily more skillful 
than rhythmically independent move- 
ment, but rather entrained movement 
is more efficiently coded and less compu- 
tationally demanding when projected onto 
a frequency-domain basis. Entrainment
across multiple time scales (Large, 2010)
can be represented sparsely when
transformed, and therefore does not necessarily
pose much more computational challenge
than a single-scale behavior. Non-linear
coupled oscillators, such as those
hypothesized to underlie entrainment 
(Large, 2010), have been shown to be 
more efficiently coded and tracked in the 
frequency domain (Buchli et al., 2008; 
Orchard et al., 2013). Perceived persistence 
of rhythmic structures in the absence of 
updated information (Large and Palmer,
2002) is explained by pursuit of the
sparsest basis for the signal. Similarly, the error
minimization driving predictive coding 
(Vuust et al., 2009) is accounted for by 
the least-squares optimization properties 
inherent in SVD diagonalization. Finally, 
the long tradition of fascinating studies 
showing that humans, while in communi- 
cation with each other, synchronize from 
head to toe (Condon and Ogston, 1966; 
Trevarthen, 1979; Bernieri et al., 1988;
Couper-Kuhlen, 1993; Shockley et al.,
2003) is not necessarily describing a
behavior of great sophistication so much as a
process of economy: whatever information 
is being communicated between subjects 
is mapped internally, for each participant, 
onto a mathematical basis that has trans- 
formed space and time into multi-scale 
frequency. Entraining together allows this 
communication to take a more efficient 
form than when the subjects retain rhyth- 
mic independence. Entrainment is not 
virtuosity, it is sparsity. 

4. VALIDATING THE SPARSITY MODEL 

What experiment might validate the 
hypothesis that entrainment facilitates 
sparse coding? While we cannot observe 
information coding directly, we can 
observe behavior, and while we do not 
have access to the atomic dictionaries 
within a subject, we can determine the 
SVD of a subject's actions. The singular
values of the SVD further provide an
effective measure of how sparsely the
information is encoded, known as
singular value entropy (SVE). If most of the
information is sparsely packed into a small 
number of basis vectors, the entropy of 
the singular value set will be low, as some 
vectors will have very high singular values 
and most will be very low. On the other 
hand, if the information is encoded less 
sparsely, the information will be spread
diffusely among the basis vectors,
increasing the entropy. If entrainment aids the
neurally coded JSR of a movement, then
the distribution of values within the SVD 
of the behavior is likely to shift. In
particular, entropy of the singular values can be
expected to decrease, with the first few
singular values increasingly dominant.
If SVE of the kinematic vectors of
a behavior decreases while entrained, it 
may be taken as evidence for a cognitive 
re-mapping of the action. 
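
The proposed measurement can be sketched as follows. The text does not fix a formula for SVE, so this example assumes one common definition: the Shannon entropy of the singular values after normalising them to sum to one. It assumes NumPy, and the "entrained" and "independent" movement matrices are synthetic stand-ins for kinematic recordings, so the output illustrates the computation only and is not evidence for the hypothesis.

```python
import numpy as np

def singular_value_entropy(X):
    """Shannon entropy of the normalised singular values of X (assumed definition)."""
    s = np.linalg.svd(X, compute_uv=False)
    p = s / s.sum()
    p = p[p > 0]                   # avoid log(0) for exactly zero values
    return float(-np.sum(p * np.log(p)))

rng = np.random.default_rng(3)     # fixed seed, for repeatability only
t = np.linspace(0.0, 10.0, 500)    # 10 s of samples (illustrative)
n_channels = 8                     # hypothetical kinematic channels

# Synthetic "entrained" movement: every channel follows one shared 1 Hz
# rhythm, each with its own gain.
gains = rng.uniform(0.5, 1.5, n_channels)
entrained = np.outer(np.sin(2 * np.pi * 1.0 * t), gains)

# Synthetic "independent" movement: each channel has its own frequency.
freqs = np.linspace(0.7, 2.1, n_channels)
independent = np.column_stack([np.sin(2 * np.pi * f * t) for f in freqs])

# Add the same level of measurement noise to both recordings.
noise = 0.05
entrained = entrained + noise * rng.standard_normal(entrained.shape)
independent = independent + noise * rng.standard_normal(independent.shape)

sve_entrained = singular_value_entropy(entrained)
sve_independent = singular_value_entropy(independent)

print(f"SVE, entrained:   {sve_entrained:.3f}")
print(f"SVE, independent: {sve_independent:.3f}")
print(f"maximum possible: {np.log(n_channels):.3f}")
```

With rows as time samples and columns as channels, a single dominant singular value yields low entropy, while evenly spread singular values approach the maximum of log(number of channels).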

From this hypothesis for the cognitive 
impact of entrainment, a second hypothe- 
sis for entrainment-based therapy may be 
additionally derived: if the lasting result 
of a repeated entrainment-based interven- 
tion is a persistent shift in kinematic SVE 
of a behavior, even independent of the 
intervention, the SVE alteration is evi- 
dence of entrainment-driven neuroplastic 
change. 

5. CONCLUSION 

As the presence of the cochlea has 
long suggested to anatomists, and as 
neural coding theory now asserts, the 
brain is much more aligned to the fre- 
quency domain than our everyday, spatio- 
temporal accounts of the world might lead 
us to think. Consequently, the impact of 
entrainment-based instruction and ther- 
apy is likely much greater than that which 
can be forecast by spatiotemporal
analysis of actions. Entrainment is everywhere;
entrainment is powerful; but perhaps most 
importantly, entrainment is sparse. A spar- 
sity model of entrainment therapy sug- 
gests that entrainment therapy is much 
more than a way to scaffold the re- 
learning of movements: it is potentially 
one of the most powerful approaches to 
the changing of behavior in the contempo- 
rary repertoire. 